

Search for: All records

Creators/Authors contains: "Johnson, W. Evan"


  1. Background

    Blood-based biomarkers for diagnosing active tuberculosis (TB), monitoring treatment response, and predicting risk of progression to TB disease have been reported. However, validation of the biomarkers across multiple independent cohorts is scarce. A robust platform to validate TB biomarkers in different populations with clinical end points is essential to the development of a point-of-care clinical test. NanoString nCounter technology is an amplification-free digital detection platform that directly measures mRNA transcripts with high specificity. Here, we determined whether NanoString could serve as a platform for extensive validation of candidate TB biomarkers.

    Methods

    The NanoString platform was used for performance evaluation of existing TB gene signatures in a cohort in which signatures were previously evaluated on an RNA-seq dataset. A NanoString codeset that probes 107 genes comprising 12 TB signatures and 6 housekeeping genes (NS-TB107) was developed and applied to total RNA derived from whole blood samples of TB patients and individuals with latent TB infection (LTBI) from South India. The TBSignatureProfiler tool was used to score samples for each signature. An ensemble of machine learning algorithms was used to derive a parsimonious biomarker.

    Results

    Gene signatures present in NS-TB107 had statistically significant discriminative power for segregating TB from LTBI. Further analysis of the data yielded a NanoString 6-gene set (NANO6) that, when tested on 10 published datasets, was highly diagnostic for active TB.

    Conclusions

    The NanoString nCounter system provides a robust platform for validating existing TB biomarkers and deriving a parsimonious gene signature with enhanced diagnostic performance.

     
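    As a rough illustration of the signature-scoring step described above (not the TBSignatureProfiler R/Bioconductor tooling used in the study), the Python sketch below scores samples against a gene signature with a simple mean z-score and checks how well the scores separate TB from LTBI via an AUC. The expression matrix, gene list, and scoring rule are assumptions for illustration only.

    # Minimal sketch: score whole-blood expression samples against a gene signature
    # using a mean z-score, then check TB vs. LTBI discrimination with an AUC.
    # Assumes `expr` is a samples x genes DataFrame of normalized expression and
    # `labels` marks active TB (1) vs. LTBI (0); data and gene list are hypothetical.
    import numpy as np
    import pandas as pd
    from sklearn.metrics import roc_auc_score

    def signature_score(expr: pd.DataFrame, genes: list) -> pd.Series:
        """Mean per-gene z-score over the signature genes present in the matrix."""
        present = [g for g in genes if g in expr.columns]
        z = (expr[present] - expr[present].mean()) / expr[present].std(ddof=0)
        return z.mean(axis=1)

    # Hypothetical example inputs.
    rng = np.random.default_rng(0)
    expr = pd.DataFrame(rng.lognormal(size=(40, 5)),
                        columns=["GBP5", "DUSP3", "KLF2", "BATF2", "SEPTIN4"])
    labels = np.array([1] * 20 + [0] * 20)

    score = signature_score(expr, ["GBP5", "DUSP3", "KLF2"])
    print("AUC for TB vs. LTBI:", roc_auc_score(labels, score))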
  2.
    The benefit of integrating batches of genomic data to increase statistical power is often hindered by batch effects, or unwanted variation in the data caused by differences in technical factors across batches. It is therefore critical to effectively address batch effects in genomic data to overcome these challenges. Many existing methods for batch effect adjustment assume the data follow a continuous, bell-shaped Gaussian distribution. However, in RNA-seq studies the data are typically skewed, over-dispersed counts, so this assumption is not appropriate and may lead to erroneous results. Negative binomial regression models have been used previously to better capture the properties of counts. We developed a batch correction method, ComBat-seq, using a negative binomial regression model that retains the integer nature of count data in RNA-seq studies, making the batch-adjusted data compatible with common differential expression software packages that require integer counts. We show in realistic simulations that the ComBat-seq adjusted data yield better statistical power and control of false positives in differential expression compared with data adjusted by the other available methods. We further demonstrate in a real data example that ComBat-seq successfully removes batch effects and recovers the biological signal in the data.
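    As a rough illustration of the negative binomial idea behind count-preserving batch correction (not the ComBat-seq implementation, which is distributed in the Bioconductor sva package), the sketch below fits a per-gene NB regression with condition and batch covariates, then quantile-maps each observed count from its batch-specific distribution to a batch-free one, returning integer counts. The fixed dispersion, single gene, and simulated counts are assumptions for illustration; ComBat-seq estimates gene- and batch-specific dispersions over full count matrices.

    # Minimal sketch of NB-regression batch adjustment for counts (illustrative only).
    # Fit a per-gene NB GLM with condition and batch covariates, then quantile-map
    # each count from its batch-specific NB to the batch-free NB.
    # The fixed dispersion (alpha = 0.5) and simulated data are assumptions.
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import nbinom

    def nb_params(mu, alpha):
        """Convert NB mean/dispersion (var = mu + alpha*mu^2) to scipy's (n, p)."""
        n = 1.0 / alpha
        return n, n / (n + mu)

    def adjust_gene(counts, condition, batch, alpha=0.5):
        """Return batch-adjusted integer counts for a single gene."""
        X = sm.add_constant(np.column_stack([condition, batch]))
        fit = sm.GLM(counts, X, family=sm.families.NegativeBinomial(alpha=alpha)).fit()
        mu_batch = fit.predict(X)               # fitted mean including batch effect
        X0 = X.copy()
        X0[:, 2] = 0                            # zero out the batch column
        mu_clean = fit.predict(X0)              # batch-free fitted mean
        n1, p1 = nb_params(mu_batch, alpha)
        n0, p0 = nb_params(mu_clean, alpha)
        q = nbinom.cdf(counts, n1, p1)          # quantile of each count in its batch
        q = np.clip(q, 1e-6, 1 - 1e-6)
        return nbinom.ppf(q, n0, p0).astype(int)

    # Hypothetical data: 30 samples, one gene, counts inflated in batch 1.
    rng = np.random.default_rng(1)
    batch = np.repeat([0, 1], 15)
    condition = np.tile([0, 1], 15)
    mu_true = 50 * np.exp(0.8 * condition + 0.7 * batch)
    counts = rng.negative_binomial(n=2, p=2 / (2 + mu_true))
    print(adjust_gene(counts, condition, batch)[:10])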
  3. Genomic data are often produced in batches due to practical restrictions, which may lead to unwanted variation in the data caused by discrepancies across processing batches. Such "batch effects" often have a negative impact on downstream biological analysis and need careful consideration. In practice, batch effects are usually addressed by specifically designed software, which merges the data from different batches, then estimates batch effects and removes them from the data. Here we focus on classification and prediction problems, and propose a different strategy based on ensemble learning. We first develop prediction models within each batch, then integrate them through ensemble weighting methods. In contrast to the typical approach of removing batch effects from the merged data, our method integrates predictions rather than data. We provide a systematic comparison between these two strategies, using studies targeting diverse populations infected with tuberculosis. In one study, we simulate increasing levels of heterogeneity across random subsets of the study, which we treat as simulated batches. We then use the two methods to develop a genomic classifier for the binary indicator of disease status. We evaluate the accuracy of prediction in another independent study targeting a different population cohort. We observe a turning point in the level of heterogeneity, after which our strategy of integrating predictions yields better discrimination in independent validation than the traditional method of integrating the data. These observations provide practical guidelines for handling batch effects in the development and evaluation of genomic classifiers.
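    As a rough illustration of the "integrate predictions rather than data" strategy described in the third abstract, the sketch below trains one classifier per simulated batch, weights the models, and averages their predicted probabilities on an independent cohort, alongside the usual merge-then-train alternative. The simulated data, logistic regression models, and inverse-error weights are stand-ins; the study's actual ensemble weighting methods and genomic classifiers differ.

    # Minimal sketch of "integrate predictions, not data": train one classifier per
    # batch, weight the models, and average their predicted probabilities on an
    # independent cohort. Inverse-error weights stand in for the paper's ensemble
    # weighting methods; all data here are simulated for illustration.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(2)

    def make_batch(n, shift):
        """Simulated expression batch: 20 features plus a mean shift as a 'batch effect'."""
        y = rng.integers(0, 2, n)
        X = rng.normal(shift, 1.0, (n, 20)) + 0.8 * y[:, None]
        return X, y

    batches = [make_batch(60, s) for s in (0.0, 1.5, 3.0)]   # training batches
    X_val, y_val = make_batch(80, 0.5)                       # independent validation cohort

    # 1) Fit one model per batch.
    models = [LogisticRegression(max_iter=1000).fit(X, y) for X, y in batches]

    # 2) Weight each model (here by a crude inverse-error proxy on its own batch).
    weights = np.array([1.0 / (1.0 - roc_auc_score(y, m.predict_proba(X)[:, 1]) + 1e-3)
                        for m, (X, y) in zip(models, batches)])
    weights /= weights.sum()

    # 3) Integrate predictions on the independent cohort.
    pred = sum(w * m.predict_proba(X_val)[:, 1] for w, m in zip(weights, models))
    print("Ensemble AUC:", round(roc_auc_score(y_val, pred), 3))

    # For comparison: the merge-then-train strategy on naively pooled data.
    X_merged = np.vstack([X for X, _ in batches])
    y_merged = np.hstack([y for _, y in batches])
    merged = LogisticRegression(max_iter=1000).fit(X_merged, y_merged)
    print("Merged-data AUC:", round(roc_auc_score(y_val, merged.predict_proba(X_val)[:, 1]), 3))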